Measuring Amok
Term Paper for CS 224U: Natural Language Understanding
Authors
Abstract
We propose and compare a number of metrics that capture the degree to which words are restricted in the contexts in which they can occur. We re-frame the problem of contextual restrictedness and introduce the use of vector space models based on syntactic dependencies. We show that our best-performing metric, residualized entropy, is effective at selecting highly collocationally restricted words and is predictive of animacy.
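The abstract does not define residualized entropy, so the following is only a minimal sketch under stated assumptions: that the metric is the Shannon entropy of a word's distribution over dependency contexts, with the effect of word frequency removed by taking residuals from a least-squares fit on log frequency. The function names, the context labels, and the toy counts are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def context_entropy(contexts):
    """Shannon entropy (in bits) of a word's empirical context distribution."""
    counts = Counter(contexts)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def residualize(xs, ys):
    """Residuals of ys after an ordinary least-squares fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical toy data: each word maps to the dependency contexts it was seen in.
observations = {
    "amok": ["advmod:run"] * 7 + ["advmod:go"],
    "wildly": ["advmod:run", "advmod:swing", "advmod:vary", "advmod:gesture"],
    "dog": [f"nsubj:verb{i}" for i in range(8)] * 2,
}

words = list(observations)
log_freqs = [math.log2(len(observations[w])) for w in words]
entropies = [context_entropy(observations[w]) for w in words]

# Assumption: "residualized" means entropy with the log-frequency trend removed.
residuals = dict(zip(words, residualize(log_freqs, entropies)))
most_restricted = min(residuals, key=residuals.get)
```

Under these assumptions, a strongly negative residual marks a word whose contexts are less varied than its frequency alone would predict, which is the sense of "collocationally restricted" used above.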
Similar resources
Towards Measuring Scalability In Natural Language Understanding Tasks
In this paper we present a discussion of existing metrics for evaluating the performance of individual natural language understanding systems and components, as well as the commonly employed metrics for measuring the specific task difficulties. We extend and generalize the common majority class baseline metric and introduce a general entropy-based metric for measuring the task difficulty of arb...
A Tool for Measuring the Reality of Technology Trends of Interest
In this paper, we present a prototype application – the Technology Trend Tracker – to measure the reality of technology trends of interest using information on the Web to inform decisions such as when to develop training, when to invest in expertise, and more. This prototype performs this task by integrating several artificial intelligence technologies in an innovative way. These technologies i...
An information theoretic approach for using word cluster information in natural language call routing
In this paper, an information theoretic approach for using word clusters in natural language call routing (NLCR) is proposed. This approach utilizes an automatic word class clustering algorithm to generate word classes from the word based training corpus. In our approach, the information gain (IG) based term selection is used to combine both word term and word class information in NLCR. A joint...
Computing Science Group CS-RR-10-04
We present a method for automatically creating large-scale semantic networks from natural language text, based on deep semantic analysis. We provide a robust and scalable implementation, and sketch various ways in which the representation may be deployed for conceptual knowledge acquisition. A translation to RDF establishes interoperability with a wide range of standardised tools, and bridges t...
Memo CS – 03 – 09
This paper concerns infrastructural work in the fields of Language Engineering, Natural Language Processing and Computational Linguistics. We begin by defining the area of software support for research and development of components in these areas as Software Architecture for Language Engineering (SALE). The rest of the paper reviews contributions to this field, covering a wide range of work ove...